AD-A224  697 




ONR/RR.90-4 


DIFFERENTIAL  WEIGHT  PROCEDURE  OF 
THE  CONDITIONAL  P.D.F.  APPROACH 
FOR  ESTIMATING  THE  OPERATING 
CHARACTERISTICS  OF 
DISCRETE  ITEM  RESPONSES 




FUMIKO SAMEJIMA


UNIVERSITY OF TENNESSEE
KNOXVILLE, TENN. 37996-0900


JUNE,  1990 


Prepared under the contract number N00014-87-K-0320,
4421-549 with the

Cognitive Science Research Program
Cognitive  and  Neural  Sciences  Division 
Office  of  Naval  Research 


Approved  for  public  release;  distribution  unlimited. 
Reproduction  in  whole  or  in  part  is  permitted  for 
any  purpose  of  the  United  States  Government. 



SECURITY CLASSIFICATION OF THIS PAGE

REPORT DOCUMENTATION PAGE
Form Approved, OMB No. 0704-0188

1a.  REPORT SECURITY CLASSIFICATION:
1b.  RESTRICTIVE MARKINGS:
2a.  SECURITY CLASSIFICATION AUTHORITY:
2b.  DECLASSIFICATION/DOWNGRADING SCHEDULE:
3.   DISTRIBUTION/AVAILABILITY OF REPORT: Approved for public release; distribution unlimited
4.   PERFORMING ORGANIZATION REPORT NUMBER(S):
5.   MONITORING ORGANIZATION REPORT NUMBER(S):
6a.  NAME OF PERFORMING ORGANIZATION: Fumiko Samejima, Ph.D., Psychology Department
6b.  OFFICE SYMBOL (If applicable):
6c.  ADDRESS (City, State, and ZIP Code): 310B Austin Peay Building, The University of Tennessee,
     Knoxville, TN 37996-0900
7a.  NAME OF MONITORING ORGANIZATION: Cognitive Science Research Program, 1142 CS
7b.  ADDRESS (City, State, and ZIP Code): Office of Naval Research, 800 N. Quincy Street,
     Arlington, VA 22217
8a.  NAME OF FUNDING/SPONSORING ORGANIZATION: Cognitive Science Research Program
8b.  OFFICE SYMBOL (If applicable):
8c.  ADDRESS (City, State, and ZIP Code): Office of Naval Research, 800 N. Quincy Street,
     Arlington, VA 22217
9.   PROCUREMENT INSTRUMENT IDENTIFICATION NUMBER: N00014-87-K-0320
10.  SOURCE OF FUNDING NUMBERS: Program Element No. 61153N; Project No. RR-042-04;
     Task No. 042-04-01; Work Unit Accession No. 4421-549
11.  TITLE (Include Security Classification): Differential Weight Procedure of the Conditional
     P.D.F. Approach for estimating the operating characteristics of discrete item responses
12.  PERSONAL AUTHOR(S): Fumiko Samejima, Ph.D.
13a. TYPE OF REPORT: technical report
13b. TIME COVERED: to 1990
14.  DATE OF REPORT (Year, Month, Day): June 25, 1990
15.  PAGE COUNT: 36
16.  SUPPLEMENTARY NOTATION:
17.  COSATI CODES (Group, Sub-group):
18.  SUBJECT TERMS (Continue on reverse if necessary and identify by block number):
     Latent Trait Models, Mental Test Theory, Operating Characteristics, Nonparametric
     Estimation, Discrete Item Responses
19.  ABSTRACT (Continue on reverse if necessary and identify by block number)


A new procedure of nonparametric estimation of the operating characteristics of discrete item
responses has been proposed, and it is called the Differential Weight Procedure of the Conditional P.D.F.
Approach. Some examples have been given, and sensitivities of the resulting estimated operating
characteristics to irregularities of the differential weight functions have been observed and discussed.
Usefulnesses of the method have also been discussed. These outcomes suggest the importance of further
investigation of the weight function in the future.


20.  DISTRIBUTION/AVAILABILITY OF ABSTRACT: [X] Unclassified/Unlimited  [ ] Same as Rpt.  [ ] DTIC Users
21.  ABSTRACT SECURITY CLASSIFICATION:
22a. NAME OF RESPONSIBLE INDIVIDUAL: Dr. Charles E. Davis
22c. OFFICE SYMBOL: ONR-1142-CS

DD Form 1473, JUN 86. Previous editions are obsolete.
S/N 0102-LF-014-6603


TABLE OF CONTENTS

1  Introduction
2  Common Backgrounds and Differences among Different Procedures
3  Simple Sum Procedure of the Conditional P.D.F. Approach Combined with the Normal Approach Method
4  Differential Weight Procedure
5  Examples
6  Sensitivities to Irregularities of Weight Functions
7  Usefulnesses of the Differential Weight Procedure
8  Discussion and Conclusions




The research was conducted at the principal investigator’s laboratory, 405 Austin Peay Bldg.,
Department of Psychology, University of Tennessee, Knoxville, Tennessee. Those who worked as assistants
for this research include Christine A. Golik, Barbara A. Livingston, Lee Hai Gan and Nancy H. Domm.


I  Introduction


In the past couple of decades the author has been engaged in the nonparametric estimation of the
operating characteristics of discrete item responses in the context of latent trait models (cf. Samejima,
1981b, 1988). As early as 1977 the author proposed the Normal Approximation Method (Samejima,
1977b), which can be used for item calibration both in computerised adaptive testing and in
paper-and-pencil testing. She also discussed the effective use of information functions in adaptive testing
(Samejima, 1977a). Since then, with the support of the Office of Naval Research, she has developed
several approaches and methods for the same purpose (cf. Samejima, 1977c, 1978a, 1978b, 1978c,
1978d, 1978e, 1978f, 1980a, 1980b, 1981a; Samejima and Changas, 1981). For convenience, they can be
categorised as follows.


Approaches

(1)  Bivariate P.D.F. Approach
(2)  Histogram Ratio Approach
(3)  Curve Fitting Approach
(4)  Conditional P.D.F. Approach
     (4.1)  Simple Sum Procedure
     (4.2)  Weighted Sum Procedure
     (4.3)  Proportioned Sum Procedure

Methods

(1)  Pearson System Method
(2)  Two-Parameter Beta Method
(3)  Normal Approach Method
(4)  Lognormal Approach Method

Here  by  an  approach  we  mean  a  general  procedure  in  approaching  the  operating  characteristics  of  a 
discrete  item  response,  and  by  a  method  we  mean  a  specific  method  in  approximating  the  conditional 
density  of  ability,  given  its  maximum  likelihood  estimate.  Thus  a  combination  of  an  approach  and  a 
method  provides  us  with  a  specific  procedure  for  estimating  the  operating  characteristic  of  a  discrete 
item  response. 

These  approaches  and  methods  are  characterised  by  two  features,  i.e., 

(1)  estimation  is  made  without  assuming  any  mathematical  forms  for  the  operating 
characteristics  of  discrete  item  responses,  and 

(2)  estimation is efficient enough to base itself upon a relatively small set of data of, say,
several  hundred  to  a  few  thousand  examinees. 

The present paper proposes a method which increases accuracies of estimation of the operating
characteristics of discrete item responses, especially when the true operating characteristic is represented
by a steep curve, and also at the lower and upper ends of the ability distribution where the estimation
tends to be inaccurate because of smaller numbers of subjects involved in the base data. Tentatively,
it  is  called  the  Differential  Weight  Procedure,  and  it  belongs  to  the  Conditional  P.D.F.  Approach.  This 
procedure  costs  more  CPU  time  than  the  Simple  Sum  Procedure,  which  has  been  used  frequently  (cf. 
Samejima,  1981b,  1988),  but  the  advantage  of  handling  more  than  one  item,  say,  fifty,  together  in  the 
Conditional  P.D.F.  Approach  is  still  there. 


II  Common  Backgrounds  and  Differences  among  Different 
Procedures 

Let $\theta$ be ability, or latent trait, which assumes any real number. We assume that there is a set
of test items measuring $\theta$ whose characteristics are known. This set of test items is called the Old Test,
whose meaning is somewhat close to the original itempool in the adaptive testing situation.

Let $h$ denote an item of the Old Test, $k_h$ be a discrete item response to item $h$, and $P_{k_h}(\theta)$
be the operating characteristic of $k_h$, or the conditional probability assigned to $k_h$, given $\theta$. We
assume that $P_{k_h}(\theta)$ is three-times differentiable with respect to $\theta$. We have for the item response
information function, $I_{k_h}(\theta)$, (Samejima, 1969, 1972)

(2.1)  $I_{k_h}(\theta) = -\frac{\partial^2}{\partial\theta^2} \log P_{k_h}(\theta)$ ,

and the item information function, $I_h(\theta)$, is defined as the conditional expectation of $I_{k_h}(\theta)$, given
$\theta$, such that

(2.2)  $I_h(\theta) = E[I_{k_h}(\theta) \mid \theta] = \sum_{k_h} I_{k_h}(\theta)\, P_{k_h}(\theta)$ .

Let $V$ be a response pattern such that

(2.3)  $V = \{k_h\}'$ ,  $h = 1, 2, \ldots, n$ .

The operating characteristic, $P_V(\theta)$, of the response pattern $V$ is defined as the conditional probability
of $V$, given $\theta$, and by virtue of local independence we can write

(2.4)  $P_V(\theta) = \prod_{k_h \in V} P_{k_h}(\theta)$ .

The response pattern information function, $I_V(\theta)$, is given by

(2.5)  $I_V(\theta) = -\frac{\partial^2}{\partial\theta^2} \log P_V(\theta) = \sum_{k_h \in V} I_{k_h}(\theta)$ ,

and the test information function, $I(\theta)$, is defined as the conditional expectation of $I_V(\theta)$, given $\theta$,
and we obtain from (2.1), (2.2), (2.3), (2.4) and (2.5)

(2.6)  $I(\theta) = E[I_V(\theta) \mid \theta] = \sum_{V} I_V(\theta)\, P_V(\theta) = \sum_{h=1}^{n} I_h(\theta)$ .

For the sake of simplicity in handling mathematics, the tentative transformation of $\theta$ to $\tau$ is made
by

(2.7)  $\tau = C_1^{-1} \int_{-\infty}^{\theta} [I(t)]^{1/2}\, dt + C_0$ ,

where $C_0$ is an arbitrary constant for adjusting the origin of $\tau$, and $C_1$ is an arbitrary constant
which equals the square root of the test information function, $I^*(\tau)$, of $\tau$, so that we can write

(2.8)  $C_1 = [I^*(\tau)]^{1/2}$

for all $\tau$. This transformation will be simplified if we use a polynomial approximation to the square
root of the test information function, $[I(\theta)]^{1/2}$, in the least squares sense, which is accomplished by
using the method of moments (cf. Samejima and Livingston, 1979) for the meaningful interval of $\theta$.
Thus (2.7) can be changed to the form


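(2.9)  $\tau = C_1^{-1} \sum_{k=0}^{m} \alpha_k\, (k+1)^{-1}\, \theta^{k+1} + C_0 = \sum_{k=0}^{m+1} \alpha_k^*\, \theta^k$ ,

where $\alpha_k$ $(k = 0, 1, \ldots, m)$ is the $k$-th coefficient of the polynomial of degree $m$ approximating the
square root of $I(\theta)$, and $\alpha_k^*$ is the new $k$-th coefficient which is given by

(2.10)  $\alpha_k^* = \begin{cases} C_0 & k = 0 \\ (C_1 k)^{-1}\, \alpha_{k-1} & k = 1, 2, \ldots, m+1 \end{cases}$ .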
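The coefficient rule in (2.10) can be sketched as follows. The polynomial coefficients in `alpha` are hypothetical and are taken as given here (the least squares fit itself is not shown), and the function names are illustrative only.

```python
def transformed_coefficients(alpha, c1, c0):
    """(2.10): alpha*_0 = C0 and alpha*_k = alpha_{k-1} / (C1 * k) for k = 1, ..., m+1."""
    return [c0] + [alpha[k - 1] / (c1 * k) for k in range(1, len(alpha) + 1)]

def tau_of_theta(theta, alpha_star):
    """(2.9): evaluate the polynomial in theta by Horner's rule."""
    tau = 0.0
    for coef in reversed(alpha_star):
        tau = tau * theta + coef
    return tau

# Hypothetical fit: sqrt(I(theta)) is approximated by 2 - 0.1 * theta**2 (degree m = 2).
alpha = [2.0, 0.0, -0.1]
alpha_star = transformed_coefficients(alpha, c1=2.0, c0=0.0)
print(alpha_star, tau_of_theta(1.0, alpha_star))
```

With this choice the derivative of $\tau$ with respect to $\theta$ equals the fitted square root of the information divided by $C_1$, so the transformed scale has approximately constant information $C_1^2$.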

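With this transformation of $\theta$ to $\tau$ and by virtue of (2.8), we can use the asymptotic normality with
the two parameters, $\tau$ and $C_1^{-1}$, as the approximation to the conditional distribution of the maximum
likelihood estimator $\hat\tau$, given its true value $\tau$ (cf. Samejima, 1981b). Then the first through fourth
conditional moments of $\tau$, given $\hat\tau$, can be obtained from the density function, $g^*(\hat\tau)$, of $\hat\tau$ and
from the constant $C_1$ by the following four formulae (cf. Samejima, 1981b):

(2.11)  $E(\tau \mid \hat\tau) = \hat\tau + C_1^{-2} \frac{d}{d\hat\tau} \log g^*(\hat\tau)$ ,

(2.12)  $\mathrm{Var}(\tau \mid \hat\tau) = C_1^{-2} \left[ 1 + C_1^{-2} \frac{d^2}{d\hat\tau^2} \log g^*(\hat\tau) \right]$ ,

(2.13)  $E[\{\tau - E(\tau \mid \hat\tau)\}^3 \mid \hat\tau] = C_1^{-6} \left[ \frac{d^3}{d\hat\tau^3} \log g^*(\hat\tau) \right]$

and

(2.14)  $E[\{\tau - E(\tau \mid \hat\tau)\}^4 \mid \hat\tau] = C_1^{-4} \left[ 3 + 6 C_1^{-2} \left\{ \frac{d^2}{d\hat\tau^2} \log g^*(\hat\tau) \right\} + 3 C_1^{-4} \left\{ \frac{d^2}{d\hat\tau^2} \log g^*(\hat\tau) \right\}^2 + C_1^{-4} \left\{ \frac{d^4}{d\hat\tau^4} \log g^*(\hat\tau) \right\} \right]$ .

This density function, $g^*(\hat\tau)$, can be estimated by fitting a polynomial, using the method of moments
(cf. Samejima and Livingston, 1979), as we did in the transformation of $\theta$ to $\tau$, based upon the
empirical set of $\hat\tau$'s. Note that in the above formulae the first moment is about the origin, while the
other three are about the mean.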
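Formulae (2.11) through (2.14) can be sketched in a few lines once the derivatives of $\log g^*(\hat\tau)$ are available. In this sketch the derivatives `d1` to `d4` are supplied by the caller (for example from a fitted polynomial); the function name and the normal example are assumptions made for illustration.

```python
def conditional_moments(tau_hat, c1, d1, d2, d3, d4):
    """Conditional moments of tau given tau_hat.

    d1..d4 are the first through fourth derivatives of log g*(tau_hat),
    evaluated at tau_hat; c1 is the constant C1 of (2.8).
    """
    v = c1 ** -2                                   # C1^{-2}
    mean = tau_hat + v * d1                        # (2.11), about the origin
    var = v * (1.0 + v * d2)                       # (2.12)
    third = v ** 3 * d3                            # (2.13), about the mean
    fourth = v ** 2 * (3.0 + 6.0 * v * d2 + 3.0 * (v * d2) ** 2 + v ** 2 * d4)  # (2.14)
    return mean, var, third, fourth

# Hypothetical check: tau ~ N(0, 1) and C1 = 2, so g* is N(0, 1.25) and
# log g* has derivatives -tau_hat / 1.25, -1 / 1.25, 0, 0.
mean, var, third, fourth = conditional_moments(1.0, 2.0, -1.0 / 1.25, -1.0 / 1.25, 0.0, 0.0)
print(mean, var, third, fourth)
```

In this normal case the conditional distribution is itself normal, so the third moment vanishes and the fourth equals three times the squared variance, which the formulae reproduce.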

The two coefficients, $\beta_1$ and $\beta_2$, and Pearson’s criterion $\kappa$ are obtained by

(2.15)  $\beta_1 = \mu_3^2\, \mu_2^{-3}$ ,

(2.16)  $\beta_2 = \mu_4\, \mu_2^{-2}$

and

(2.17)  $\kappa = \beta_1 (\beta_2 + 3)^2\, [4 (2\beta_2 - 3\beta_1 - 6)(4\beta_2 - 3\beta_1)]^{-1}$ ,

by replacing $\mu_2$, $\mu_3$ and $\mu_4$ by $\mathrm{Var}(\tau \mid \hat\tau)$, $E[\{\tau - E(\tau \mid \hat\tau)\}^3 \mid \hat\tau]$ and
$E[\{\tau - E(\tau \mid \hat\tau)\}^4 \mid \hat\tau]$, respectively, which are obtained by formulae (2.12), (2.13) and (2.14).
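A short sketch of (2.15) through (2.17) follows. The function name is hypothetical, and returning NaN when the denominator of (2.17) vanishes (as it does for exactly normal moments) is a convention of this sketch rather than something stated in the text.

```python
import math

def pearson_criterion(mu2, mu3, mu4):
    """Return (beta1, beta2, kappa) from the second to fourth central moments."""
    beta1 = mu3 ** 2 / mu2 ** 3                    # (2.15)
    beta2 = mu4 / mu2 ** 2                         # (2.16)
    denom = 4.0 * (2.0 * beta2 - 3.0 * beta1 - 6.0) * (4.0 * beta2 - 3.0 * beta1)
    # (2.17); the criterion is undefined when the denominator is zero
    kappa = beta1 * (beta2 + 3.0) ** 2 / denom if denom != 0.0 else math.nan
    return beta1, beta2, kappa

print(pearson_criterion(1.0, 1.0, 5.0))   # hypothetical skewed case
print(pearson_criterion(1.0, 0.0, 3.0))   # normal moments: kappa is undefined
```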

In the Bivariate P.D.F. Approach, we approximate the bivariate distribution of the transformed
latent trait $\tau$ and its maximum likelihood estimate $\hat\tau$ for each subpopulation of examinees who share
the same discrete item response to a specified item. Thus the procedure must be repeated as many times
as the number of discrete item response categories for each separate item. It is rather a time-consuming
approach, and the CPU time for the item calibration increases almost proportionally to the number of
new items.

In contrast to this, the Conditional P.D.F. Approach deals with the total population of subjects, and
all the items together. Effort is focused upon the approximation of the conditional distribution of $\tau$,
given $\hat\tau$, for the total population of examinees, and then the result is branched into separate discrete
item response subpopulations for each item.

If we compare the two approaches with each other, therefore, we can say that the Bivariate P.D.F.
Approach is an orthodox approach, while the Conditional P.D.F. Approach needs an assumption that the
conditional distribution of $\tau$, given $\hat\tau$, is unaffected by the different subpopulations of examinees.
While this assumption can only be tolerated in most cases, the latter approach has two big advantages in
the sense that the CPU time required in item calibration is substantially less, and that it does not have
to deal with subgroups of small numbers of subjects in approximating the joint bivariate distributions
of $\tau$ and $\hat\tau$.

In each of these two approaches, we can choose one of the four methods listed earlier in estimating
the bivariate density of $\tau$ and $\hat\tau$, or the conditional density of $\tau$, given its maximum likelihood
estimate $\hat\tau$. In so doing, in the Pearson System Method, we use all four conditional moments of
$\tau$, given $\hat\tau$, which are estimated through the formulae (2.11) through (2.14), and, using Pearson’s
criterion $\kappa$, which is given by (2.17), one of the Pearson System density functions is selected. In the
Two-Parameter Beta Method two of the four parameters of the Beta density function, i.e., the lower
and upper endpoints of the interval of $\tau$ for which the Beta density is positive, are a priori given, and
the other two parameters are estimated by using the first two conditional moments of $\tau$, given $\hat\tau$,
which are provided by (2.11) and (2.12), respectively. In the Normal Approach Method, again we use
only the first two conditional moments of $\tau$, given $\hat\tau$, as the first and second parameters of the normal
density function.

If we compare these three methods, it will be appropriate to say that both the Two-Parameter Beta
Method and the Normal Approach Method are simpler versions of the Pearson System Method. And yet the
latter two methods have an advantage of using only the first two estimated conditional moments of
$\tau$, given $\hat\tau$, whereas the former requires the additional third and fourth conditional moments, whose
estimations are less accurate compared with those of the first two conditional moments. If we compare
the Two-Parameter Beta Method with the Normal Approach Method, we will notice that the former
allows non-symmetric density functions, while the latter does not. This is an advantage of the
Two-Parameter Beta Method over the Normal Approach Method, and yet the former has the disadvantage
of the requirement that two of the four parameters should a priori be set.
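To make the Two-Parameter Beta Method concrete, the sketch below recovers the two free shape parameters from the first two conditional moments once the endpoints are fixed a priori. It uses the standard method-of-moments solution for a Beta density on a general interval; the function name and the parametrisation by shape parameters `p` and `q` are assumptions of this sketch, and the original report may parametrise the density differently.

```python
def beta_parameters(mean, var, lower, upper):
    """Shape parameters (p, q) of a Beta density on (lower, upper) with the given
    mean and variance, e.g. those provided by (2.11) and (2.12)."""
    width = upper - lower
    m = (mean - lower) / width          # mean rescaled to (0, 1)
    v = var / width ** 2                # variance rescaled accordingly
    common = m * (1.0 - m) / v - 1.0    # equals p + q for a Beta(p, q) density
    if common <= 0.0:
        raise ValueError("variance too large for a Beta density on this interval")
    return m * common, (1.0 - m) * common

# Hypothetical conditional moments on the a priori interval (-1, 1).
print(beta_parameters(0.0, 0.2, -1.0, 1.0))
```

A mean off the centre of the interval yields unequal shape parameters and hence a non-symmetric density, which is the flexibility the Normal Approach Method lacks.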

The Lognormal Approach Method was developed later, which uses up to the third conditional moment
and allows more flexibilities in the shape of the conditional distribution of $\tau$, given $\hat\tau$, than the Normal
Approach Method. It was intended that a happy medium between the Pearson System Method and the
Normal Approach Method would be realized, in the effort of ameliorating the disadvantages of these
two methods and of keeping their separate advantages.


III  Simple Sum Procedure of the Conditional P.D.F. Approach
Combined with the Normal Approach Method

It is obvious from the discussion given in the preceding section that the Conditional P.D.F.
Approach combined with the Normal Approach Method is the simplest and one of the most economical
procedures in CPU time. Out of the three procedures of the Conditional P.D.F. Approach the Simple
Sum Procedure is the simplest one (cf. Samejima, 1981b). For this reason, the combination of the
Simple Sum Procedure of the Conditional P.D.F. Approach and the Normal Approach Method has
most frequently been applied for simulated and empirical data. Fortunately, in spite of the simplicity of
the procedure, the results with simulated data in the adaptive testing situation and with simulated and
empirical data in the paper-and-pencil testing situation indicate that we can estimate the operating
characteristics fairly accurately by using this combination (cf. Samejima, 1981b, 1984). This seems
to prove the robustness of the Conditional P.D.F. Approach. For one thing, there is a good reason
why the Normal Approach Method works well, for the conditional distribution of $\tau$, given $\hat\tau$, is indeed
normal if the (unconditional) distribution of $\tau$ is normal, and it is a truncated normal distribution if
the (unconditional) distribution of $\tau$ is rectangular, and the truncation is negligible for most of the
conditional distributions.

In the Simple Sum Procedure of the Conditional P.D.F. Approach, the operating characteristic,
$P_{k_g}(\theta)$, of the discrete item response $k_g$ of an unknown item $g$ is estimated through the formula

(3.1)  $\hat P_{k_g}(\theta) = \hat P^*_{k_g}[\tau(\theta)] = \sum_{s \in k_g} \phi(\tau \mid \hat\tau_s) \left[ \sum_{s=1}^{N} \phi(\tau \mid \hat\tau_s) \right]^{-1}$ ,

where $s$ $(= 1, 2, \ldots, N)$ indicates an individual examinee, and $\phi(\tau \mid \hat\tau_s)$ denotes the conditional density
of $\tau$, given $\hat\tau_s$. This conditional density is estimated by using the estimated conditional moments of
$\tau$, given $\hat\tau_s$, using one of the four methods, as was described in the preceding section.

In the Weighted Sum Procedure of the Conditional P.D.F. Approach, we have for the estimated
operating characteristic of $k_g$

(3.2)  $\hat P_{k_g}(\theta) = \hat P^*_{k_g}[\tau(\theta)] = \sum_{s \in k_g} w(\hat\tau_s)\, \phi(\tau \mid \hat\tau_s) \left[ \sum_{s=1}^{N} w(\hat\tau_s)\, \phi(\tau \mid \hat\tau_s) \right]^{-1}$ ,

where $w(\hat\tau_s)$ is the weight function of $\hat\tau_s$. When we combine one of these two procedures with
the Normal Approach Method, $\phi(\tau \mid \hat\tau_s)$ in (3.1) or in (3.2) is approximated by the normal density
function, using the first two estimated conditional moments of $\tau$, given $\hat\tau_s$, which are given by (2.11)
and (2.12), respectively, as its parameters, $\mu_{\hat\tau_s}$ and $\sigma_{\hat\tau_s}$, in the formula

(3.3)  $\phi(\tau \mid \hat\tau_s) = [2\pi]^{-1/2}\, [\sigma_{\hat\tau_s}]^{-1} \exp[-(\tau - \mu_{\hat\tau_s})^2 / \{2 \sigma_{\hat\tau_s}^2\}]$ .
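A minimal sketch of (3.1) through (3.3) is given below. Each examinee is represented by a hypothetical tuple holding the estimate, the conditional mean and standard deviation obtained from (2.11) and (2.12), and the observed response; the data and function names are invented for illustration.

```python
import math

def normal_density(tau, mu, sigma):
    """(3.3): normal density with the conditional mean and standard deviation."""
    return math.exp(-(tau - mu) ** 2 / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)

def sum_procedure(tau, examinees, response, weight=lambda tau_hat: 1.0):
    """(3.1) when the weight is constant, (3.2) otherwise.

    examinees: list of (tau_hat, mu, sigma, observed_response) tuples.
    """
    numerator = 0.0
    denominator = 0.0
    for tau_hat, mu, sigma, observed in examinees:
        term = weight(tau_hat) * normal_density(tau, mu, sigma)
        denominator += term          # sum over all N examinees
        if observed == response:
            numerator += term        # sum over examinees who gave this response
    return numerator / denominator

# Hypothetical examinees: (tau_hat, conditional mean, conditional s.d., response).
examinees = [(-1.0, -0.9, 0.4, 0), (-0.2, -0.18, 0.4, 0), (0.3, 0.27, 0.4, 1), (1.1, 1.0, 0.4, 1)]
print(sum_procedure(0.0, examinees, 1))
```

Because every examinee contributes to exactly one response category, the estimates for the categories of an item sum to one at every value of $\tau$.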

IV  Differential  Weight  Procedure 

If we accept the approximation of the conditional distribution of $\hat\tau$, given $\tau$, by the asymptotic
normality, as we do in these approaches (cf. Samejima, 1981b), the other conditional distribution, i.e.,
that of $\tau$, given $\hat\tau$, will become more or less incidental. Thus in the Bivariate P.D.F. Approach
the bivariate distribution of $\tau$ and $\hat\tau$ is approximated for each separate item score subpopulation of
subjects of each unknown test item. In the Conditional P.D.F. Approach, however, the incidentality
of this second conditional distribution is not rigorously considered, and the implicit assumption exists
such that for the fixed value of $\hat\tau$ the conditional distributions of $\tau$ are similar for the different item
score subpopulations.

Take the dichotomous response level, for example. On this level, each item is scored “right” or
“wrong”, “affirmative” or “negative”, etc. The above assumption of non-incidentality may be acceptable
when the operating characteristic of the correct answer of the item is represented by a mildly steep curve,
as is the case with most practical situations, and the questions are asked to subjects whose ability levels
are compatible with the difficulty levels of the questions, as is the case with adaptive testing and, though
less rigorously, with many cases of paper-and-pencil testing.

This assumption is not acceptable, however, when the operating characteristic of the correct answer
is represented by a steep curve. If the operating characteristic follows the Guttman scale, for example,
then the conditional distributions of $\tau$, given $\hat\tau$, for the two separate item score subpopulations
are distinctly separated, and they do not even overlap! If we use the Simple Sum Procedure or the
Weighted Sum Procedure for an item which nearly follows the Guttman scale, therefore, the resulting
estimated operating characteristics of the correct and the incorrect answers will tend to be flatter than
they actually are.

This problem can be solved by estimating differential conditional distributions of $\tau$, given $\hat\tau$, for
the separate discrete item responses to an “unknown” item. Let $\phi_{k_g}(\tau \mid \hat\tau)$ denote the conditional
density of $\tau$, given $\hat\tau$, for the subpopulation of subjects who share the same discrete item response
$k_g$ to an “unknown” item $g$. We can write

(4.1)  $\phi_{k_g}(\tau \mid \hat\tau) = f^*_{k_g}(\tau)\, \psi(\hat\tau \mid \tau)\, [g^*_{k_g}(\hat\tau)]^{-1}$ ,

where $f^*_{k_g}(\tau)$ indicates the density of $\tau$ for the subpopulation of subjects who share $k_g$ as their
common item score of item $g$, $\psi(\hat\tau \mid \tau)$ is the conditional density of $\hat\tau$, given $\tau$, which is approximated
by the normal density, $n(\tau, C_1^{-1})$, and $g^*_{k_g}(\hat\tau)$ is the marginal density of $\hat\tau$ for this subpopulation,
for which we have

(4.2)  $g^*_{k_g}(\hat\tau) = \int_{-\infty}^{\infty} f^*_{k_g}(\tau)\, \psi(\hat\tau \mid \tau)\, d\tau$ .

We notice that there is a relationship

(4.3)  $f^*_{k_g}(\tau) = f^*(\tau)\, P^*_{k_g}(\tau) \left[ \int_{-\infty}^{\infty} f^*(\tau)\, P^*_{k_g}(\tau)\, d\tau \right]^{-1}$ ,

where $f^*(\tau)$ denotes the density of $\tau$ for the total population. Since we have

(4.4)  $\phi(\tau \mid \hat\tau) = f^*(\tau)\, \psi(\hat\tau \mid \tau)\, [g^*(\hat\tau)]^{-1}$ ,

where $g^*(\hat\tau)$ is the density of $\hat\tau$ for the total population of subjects, as was described in the preceding
section, which is given by

(4.5)  $g^*(\hat\tau) = \int_{-\infty}^{\infty} f^*(\tau)\, \psi(\hat\tau \mid \tau)\, d\tau$ ,

from the above formulae we obtain

(4.6)  $\phi_{k_g}(\tau \mid \hat\tau) = \phi(\tau \mid \hat\tau)\, P^*_{k_g}(\tau)\, h(\hat\tau)$ ,

where $h(\hat\tau)$ is a function of $\hat\tau$ and constant for a fixed value of $\hat\tau$. Thus $\phi_{k_g}(\tau \mid \hat\tau)$ is a density
function proportional to $\phi(\tau \mid \hat\tau)\, P^*_{k_g}(\tau)$. We notice that $\phi(\tau \mid \hat\tau)$ in this formula is common to all
the item scores and across different unknown items, while $P^*_{k_g}(\tau)$ is a specific function of $\tau$ for each
$k_g$. Since $\phi(\tau \mid \hat\tau)$ can be estimated by one of the four methods described earlier, our effort should be
focused on finding an appropriate differential weight function for each $k_g$. Let $W_{k_g}(\tau)$ denote such
a differential weight function, which replaces $P^*_{k_g}(\tau)\, h(\hat\tau)$ in (4.6). Thus we can revise (3.1) and (3.2)
into the forms

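(4.7)  $\hat P_{k_g}(\theta) = \hat P^*_{k_g}[\tau(\theta)] = \sum_{s \in k_g} W_{k_g}(\tau)\, \phi(\tau \mid \hat\tau_s) \left[ \sum_{s=1}^{N} W_{k_g}(\tau)\, \phi(\tau \mid \hat\tau_s) \right]^{-1}$

and

(4.8)  $\hat P_{k_g}(\theta) = \hat P^*_{k_g}[\tau(\theta)] = \sum_{s \in k_g} w(\hat\tau_s)\, W_{k_g}(\tau)\, \phi(\tau \mid \hat\tau_s) \left[ \sum_{s=1}^{N} w(\hat\tau_s)\, W_{k_g}(\tau)\, \phi(\tau \mid \hat\tau_s) \right]^{-1}$ .

Since the differential weight function $W_{k_g}(\tau)$ involves $P^*_{k_g}(\tau)$, which itself is the target of estimation,
we may use its estimate, $\hat P^*_{k_g}(\tau)$, obtained by the Simple Sum Procedure or by the Weighted Sum
Procedure, as its substitute. In so doing we may need some local smoothings of $\hat P^*_{k_g}(\tau)$ where the
estimation involves substantial amounts of error because of locally small numbers of subjects in the base
data, etc. In some cases we may need several iterations by renewing the differential weight functions
on each stage until the resulting estimated operating characteristic converges.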
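The idea behind (4.6) through (4.7) can be sketched as a single pass on a grid of $\tau$. This is one possible reading, not the report's implementation: it assumes that each examinee's normal density is multiplied by the initial estimate for that examinee's own observed response, and that the product is renormalised numerically on the grid, which stands in for the factor $h(\hat\tau)$. The grid, the data, the initial estimates and all names are hypothetical.

```python
import math

def normal_density(tau, mu, sigma):
    return math.exp(-(tau - mu) ** 2 / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)

def differential_weight_pass(grid, examinees, initial):
    """One pass of the differential weight idea on an equally spaced grid of tau.

    examinees: list of (mu, sigma, observed_response) tuples.
    initial:   dict mapping each response to a function tau -> initial estimate
               of its operating characteristic (e.g. from the Simple Sum Procedure).
    Returns a dict mapping each response to its re-estimated curve on the grid.
    """
    step = grid[1] - grid[0]
    sums = {k: [0.0] * len(grid) for k in initial}
    for mu, sigma, observed in examinees:
        # phi(tau | tau_hat_s) times the weight for the examinee's own response, cf. (4.6)
        raw = [normal_density(t, mu, sigma) * initial[observed](t) for t in grid]
        area = sum(raw) * step               # numerical stand-in for 1 / h(tau_hat_s)
        if area <= 0.0:
            continue                         # the weighted density vanishes on the grid
        for j, value in enumerate(raw):
            sums[observed][j] += value / area
    total = [sum(sums[k][j] for k in sums) for j in range(len(grid))]
    return {k: [s / t if t > 0.0 else float("nan") for s, t in zip(sums[k], total)]
            for k in sums}

def logistic(t):
    """Hypothetical initial estimate for the correct answer."""
    return 1.0 / (1.0 + math.exp(-1.7 * t))

grid = [-3.0 + 0.1 * j for j in range(61)]
examinees = [(-0.9, 0.4, 0), (-0.18, 0.4, 0), (0.27, 0.4, 1), (1.0, 0.4, 1)]
initial = {1: logistic, 0: lambda t: 1.0 - logistic(t)}
curves = differential_weight_pass(grid, examinees, initial)
print(curves[1][30])   # estimate for the correct answer at tau = 0
```

Iterating, as the text suggests, would amount to feeding a smoothed version of `curves` back in as `initial` until the curves stop changing; that loop and the smoothing step are left out of this sketch.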


V  Examples 

We have tried this proposed method on the simulated data provided by Dr. Charles Davis of
the Office of Naval Research, using the Simple Sum Procedure of the Conditional P.D.F. Approach
combined with the Normal Approach Method with some modifications as the initial estimate of $P^*_{k_g}(\tau)$
in the differential weight function. These data are simulated on-line item calibration data of the initial
itempool calibration based upon conventional testing, in which 100 dichotomous items are divided
into four subtests of 25 items each, and each subtest has been administered to 6,000 hypothetical
examinees, and those of different rounds based upon adaptive testing, in which each of the 50 new
binary items has been administered to a subgroup of 1,500 hypothetical subjects out of the total
of 15,000. These hypothetical examinees’ ability distributes unimodally within the interval of $\theta$,
$(-3.0, 3.0)$, with slight negative skewness.

For  the  purpose  of  illustration,  Figure  5-1  presents  the  results  of  the  Differential  Weight  Procedure 
using  the  results  of  the  Simple  Sum  Procedure  of  the  Conditional  P.D.F.  Approach  combined  with  the 
Normal  Approach  Method  with  some  modifications  as  the  initial  estimates,  for  eight  items  of  the  initial 
itempool.  They  are  dichotomous  items,  and  were  intentionally  selected  from  those  items  whose  true 
operating  characteristics  of  the  correct  answer  are  non-monotonic,  in  order  to  visualize  the  benefit  of  the 
nonparametric  estimation  of  the  operating  characteristic.  In  each  graph,  also  presented  for  comparison 
is  the  best  fitted  operating  characteristic  of  the  correct  answer  following  the  three-parameter  logistic 
model,  which  has  been  given  by  Dr.  Michael  Levine.  The  logistic  model  on  the  dichotomous  level  is 
represented  by 
FIGURE 5-1

Eight Examples of the Estimated Operating Characteristic of the Correct Answer
Using the Differential Weight Procedure (Dotted Line), in Comparison with
the True Operating Characteristic (Solid Line) and the Best Fitted
Three-Parameter Logistic Curve (Dashed Line).
(5.1)  $P_g(\theta) = [1 + \exp\{-D a_g (\theta - b_g)\}]^{-1}$ ,


where $P_g(\theta)$ denotes the operating characteristic of the correct answer to item $g$, $a_g$ and $b_g$ are the
item discrimination and difficulty parameters, respectively, and $D$ is a scaling factor which is usually
set equal to 1.7. We can see in these graphs that the resulting estimated operating characteristics are
fairly close to the true ones, and that they reflect the non-monotonicities.

VI  Sensitivities  to  Irregularities  of  Weight  Functions 

As we have proceeded, several factors have been identified and observed which affect the resulting
estimated operating characteristics substantially. They are concerned with the differential weight
function, and can be itemized as: 1) lower end ambiguities, 2) upper end ambiguities, 3) local irregularities
and 4) overall irregularities.

Out of these factors, lower and upper end ambiguities basically come from the fact that we do not
usually have sufficiently large numbers of subjects on the lowest and the highest ends of the interval
of $\theta$ of interest upon which the estimation of the operating characteristics is made. Also the fact that
the test information function $I(\theta)$ is used in the transformation of $\theta$ to $\tau$ which is specified by
(2.7) may have something to do with these ambiguities. It has been observed (Samejima, 1979b) that
in using equivalent items following the Constant Information Model (Samejima, 1979a) the speed of
convergence of the conditional distribution of the maximum likelihood estimate $\hat\theta$, given $\theta$, to the
asymptotic normality with $\theta$ and $[I(\theta)]^{-1/2}$ as its two parameters substantially differs for different
levels of $\theta$, in spite of the fact that the amount of test information is constant for every level of $\theta$.
To be more specific, the convergence is observed to be much slower at those levels which are close to
either end of the interval of $\theta$ for which the amount of test information is non-zero and constant, and
faster at intermediate levels of $\theta$. This situation can be ameliorated if we replace the test information
function $I(\theta)$ in (2.7) by one of its two modified forms (cf. Samejima, 1990a). We can write for the
Modification Formula No. 1, $\Upsilon(\theta)$,

(6.1)  $\Upsilon(\theta) = I(\theta) \left[ 1 + \frac{\partial}{\partial\theta} B(\hat\theta_V \mid \theta) \right]^{-2}$ ,

which is the reciprocal of an approximate minimum bound of the variance of the maximum likelihood
estimator, where $B(\hat\theta_V \mid \theta)$ is the MLE bias function of the test consisting of items with any discrete
item responses $k_h$. In the general case of discrete item responses, we can write for the bias function
of the maximum likelihood estimate

(6.2)  $B(\hat\theta_V \mid \theta) = E[\hat\theta_V - \theta \mid \theta] = -(1/2)\, [I(\theta)]^{-2} \sum_{h=1}^{n} \sum_{k_h} P'_{k_h}(\theta)\, P''_{k_h}(\theta)\, [P_{k_h}(\theta)]^{-1}$ ,

where, as before, $P_{k_h}(\theta)$ is the operating characteristic of the discrete response $k_h$, and $P'_{k_h}(\theta)$
and $P''_{k_h}(\theta)$ denote the first and second partial derivatives of $P_{k_h}(\theta)$ with respect to $\theta$, respectively.
Modification Formula No. 2, $\Xi(\theta)$, is given by

(6.3)  $\Xi(\theta) = I(\theta) \left\{ \left[ 1 + \frac{\partial}{\partial\theta} B(\hat\theta_V \mid \theta) \right]^2 + I(\theta)\, [B(\hat\theta_V \mid \theta)]^2 \right\}^{-1}$ ,

which is the reciprocal of an approximate minimum bound of the mean squared error of the maximum
likelihood estimator. When the MLE bias function of the test is monotone increasing, as is the case in
many situations, it is obvious from (6.1), (6.2) and (6.3) that we have the relationship
FIGURE 6-1

Seven Examples of the Estimated Operating Characteristic of the Correct Answer
Using the Differential Weight Procedure (Dotted Line), in Comparison with the
True Operating Characteristic (Solid Line), When the Differential Weight
Function (Short Dashed Line) Has Irregularities. The Function Was Also
Proportionally Enlarged and Plotted (Long Dashed Line) to Visualise
the Angles and Other Irregularities Well.
(6.4)  $\Xi(\theta) \leq \Upsilon(\theta) \leq I(\theta)$ ,


where the first inequality in (6.4) always holds regardless of the shape of the MLE bias function. Which
one  of  the  two  modified  test  information  functions  is  more  appropriate  to  use  depends  upon  the  situation, 
and  we  need  more  investigation  to  answer  this  question. 
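
To make the two modification formulas concrete, the following is a minimal numerical sketch, not the procedure used in this research. It assumes a hypothetical test of 20 dichotomous items following the two-parameter logistic model with scaling constant D = 1.7 and equally spaced difficulty parameters; the function names, the item parameters, and the use of a finite-difference derivative of the bias function are all illustrative assumptions. The first inequality of (6.4) can be checked directly, since the denominator of the second modification formula differs from that of the first only by the non-negative term $I(\theta)[B(\hat{\theta}_V \mid \theta)]^{2}$.

```python
import numpy as np

def two_pl_categories(theta, a, b, D=1.7):
    """Operating characteristics and their first two derivatives.

    Assumption: dichotomous 2PL items (not taken from the report).
    Returns arrays of shape (items, 2 response categories, theta points),
    with category 0 = incorrect and category 1 = correct.
    """
    z = D * a[:, None] * (theta[None, :] - b[:, None])
    p = 1.0 / (1.0 + np.exp(-z))
    q = 1.0 - p
    da = D * a[:, None]
    d1 = da * p * q                          # first derivative of p
    d2 = da ** 2 * p * q * (1.0 - 2.0 * p)   # second derivative of p
    P = np.stack([q, p], axis=1)
    P1 = np.stack([-d1, d1], axis=1)
    P2 = np.stack([-d2, d2], axis=1)
    return P, P1, P2

def modified_information(theta, P, P1, P2):
    """Test information, MLE bias (6.2), and the modified forms (6.1), (6.3)."""
    info = np.sum(P1 ** 2 / P, axis=(0, 1))
    bias = -0.5 * np.sum(P1 * P2 / P, axis=(0, 1)) / info ** 2
    # Assumption: the derivative of the bias function is approximated
    # by finite differences on the theta grid.
    dbias = np.gradient(bias, theta)
    upsilon = info / (1.0 + dbias) ** 2
    xi = info / ((1.0 + dbias) ** 2 + info * bias ** 2)
    return info, bias, upsilon, xi

# Hypothetical test: 20 items, common discrimination, symmetric difficulties.
theta = np.linspace(-3.0, 3.0, 601)
a = np.full(20, 1.0)
b = np.linspace(-2.0, 2.0, 20)

P, P1, P2 = two_pl_categories(theta, a, b)
info, bias, upsilon, xi = modified_information(theta, P, P1, P2)

# Fraction of grid points at which each inequality of (6.4) holds.
print("Xi <= Upsilon:", np.mean(xi <= upsilon + 1e-12))
print("Upsilon <= I :", np.mean(upsilon <= info + 1e-12))
```

The second printed fraction depends on whether the bias function of the chosen test is monotone increasing over the grid, which is the condition stated for the second inequality of (6.4).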

By irregularity we mean non-smoothness, which is exemplified by an unnatural angle, etc. It has
been observed that for most items the resulting operating characteristic is amazingly sensitive to these
irregularities of the differential weight function. Figure 6-1 illustrates how irregularities in the
differential weight function affect the resulting estimated operating characteristic.

The effect of local irregularities is most interesting to observe in the examples presented in Figure
6-1. In each of these graphs, the artificially irregular differential weight function for the correct answer
is drawn by a short dashed line, and, in order to emphasize its irregularities, it was proportionally
enlarged and shown by a long dashed line. We can see in each graph that, when the differential weight
function has an unnatural angle, for example, the resulting estimated operating characteristic of the
correct answer also shows an unnatural angle at approximately the same level of $\theta$. We can also see in
these  graphs  how  overall  irregularities  of  the  differential  weight  function  affect  the  resulting  estimated 
operating  characteristic,  and  how  sensitive  the  latter  is  to  the  former.  This  type  of  sensitivity  of  the 
resulting  estimated  operating  characteristic  to  the  irregularities  of  the  differential  weight  function  is 
encouraging  as  well  as  threatening,  for  it  promises  success  in  the  estimation  provided  that  we  succeed 
in  finding  the  right  differential  weight  function. 

VII  Usefulnesses  of  the  Differential  Weight  Procedure 

It is obvious that item analysis in the true sense of the word starts from the accurate estimation of
the operating characteristics of the item responses. Thus the nonparametric estimation of the operating
characteristic offers a great deal of information about an item, when it is successful. In this sense we
can say that the Differential Weight Procedure holds promise for successful item analysis
in general.

Following this, we can conceive of many applications of the method for different purposes. To give
some examples, it will be especially useful for the on-line item calibration in computerized adaptive
testing; also it will be useful in the revision of multiple-choice test items in order to reduce the effect of
noise and to make the ability estimation efficient (cf. Samejima, 1990b).


VIII  Discussion  and  Conclusions 

A new procedure of nonparametric estimation of the operating characteristics of discrete item
responses has been proposed, and it is called the Differential Weight Procedure of the Conditional P.D.F.
Approach. Some examples have been given, and sensitivities of the resulting estimated operating
characteristics to irregularities of the differential weight functions have been observed and discussed.
Usefulnesses of the method have also been discussed.

These  outcomes  suggest  the  importance  of  further  investigation  of  the  weight  function  in  the  future. 

To summarize, the Simple Sum Procedure of the Conditional P.D.F. Approach combined with
the Normal Approach Method works reasonably well for the on-line item calibration of adaptive testing,
and also for paper-and-pencil testing, especially when the number of subjects is large. If we wish
to increase the accuracy of estimation, we can use the Differential Weight Procedure. The disadvantage
will be the added CPU time, so we need to weigh the cost against the accuracy of estimation
before we make our decision. The present procedure will be less expensive, however, than the
Bivariate P.D.F. Approach in terms of the CPU time required.